Add option to drop partial buckets from date_histogram visuals #19979

blfrantz · 2018-06-18T01:48:55Z

Closes #2806.

Note: description has changed since the original version to reflect a new implementation.

Date Histograms often begin and/or end with incomplete buckets, which makes the beginning/end of time-series charts appear to show steep up/down trends which can be misleading or alarming. This feature adds a new "Drop partial buckets" option for the Date Histogram aggregation. It only appears/applies when the chosen field is the same as the index's Time Filter field (that's the only case where this feature makes sense).

When selected, any bucket that extends beyond the query's Time Range (it starts before the range begins or ends after the range ends) will be removed from the chart.
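As a rough illustration of the rule above (not the code from this PR), a bucket can be treated as partial when it starts before the range or ends after it. The `dropPartialBuckets` helper, the `{ key, doc_count }` bucket shape, and the `{ gte, lte }` range in milliseconds are assumptions made for this sketch.

```javascript
// Hypothetical helper: each bucket is { key: <start in ms>, doc_count },
// and range is { gte, lte } in ms. Neither shape is taken from Kibana.
function dropPartialBuckets(buckets, intervalMs, range) {
  return buckets.filter(bucket => {
    const start = bucket.key;
    const end = start + intervalMs;
    // Keep the bucket only if it lies entirely inside the queried range.
    return start >= range.gte && end <= range.lte;
  });
}

const buckets = [
  { key: 0, doc_count: 3 },       // starts before the range: partial
  { key: 60000, doc_count: 10 },
  { key: 120000, doc_count: 12 },
  { key: 180000, doc_count: 4 },  // ends after the range: partial
];
const kept = dropPartialBuckets(buckets, 60000, { gte: 15000, lte: 195000 });
console.log(kept.map(b => b.key)); // [ 60000, 120000 ]
```

With an absolute range that lines up exactly with the bucket boundaries, nothing would be dropped, which is the case the description wants to preserve.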

elasticmachine · 2018-06-18T01:48:57Z

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

ppisljar · 2018-06-18T08:10:37Z

thanks a lot for your contribution @blfrantz !

As you mention this is something aggregation specific. Adding it to handler.js means it will only work with pie, area, line, bar, heatmap and gauge charts. It won't work with any of the others (maps, metric, tagcloud, more to come, any 3rd party visualizations).

I would suggest implementing this as an aggregation option. The date histogram aggregation config and logic is in ui/agg_types/buckets/date_histogram.js. This way this option can be completely independent from the visualization.

The actual conversion of data happens in tabify response handler, this is probably where you should check for this option and skip partial buckets. ui/vis/response_handlers/tabify.js and basic.js in same folder use the tabify functions defined in ui/agg_response/tabify.

let me know what you think and if this is helpful.

blfrantz · 2018-06-18T13:24:22Z

Thanks @ppisljar! Sounds like good advice. I like the idea of putting this with the date_histogram aggregation instead of the axis config. I was thinking this was more of an axis display option which led me to my implementation, but yours should be a lot cleaner. I'll work on this and update the PR shortly (doing this in my free time so it may be a few days).

blfrantz · 2018-06-23T00:22:11Z

Hey @ppisljar, now that I've gotten into it I have a question: is it possible to get the time range covered by the query from within the response_handlers? It seems like this information should be knowable since the exact timestamps are specified in the "range" portion of the _msearch query, but I can't find where to get this from the vis object. My issue is that without knowing the range of time included in the query, I can't know for sure which buckets are indeed "partial." I don't want to just assume the first/last ones are and drop them every time since absolute time ranges could be expected to align with the buckets and not have any partials.

The nice thing about my first approach, ugly as it was, is it had access to this range info and could use the same logic as the "this area contains partial data" tooltip (which incidentally also made it easy to extend that tooltip to say something more relevant in this case).

Any suggestions? I do like the genericism of your suggestion, but it's not clear to me where to get this range information.

Edit: One other nice thing about editing the data in handler.js and treating this as a view option rather than an aggregation option, is we don't need to requery when toggling this setting. But that's minor.

blfrantz · 2018-06-27T23:11:36Z

@ppisljar, any thoughts? Thanks!

ppisljar · 2018-06-28T06:43:26Z

sorry for late reply.

in your response handler you can access the timefilter on vis.filters.timeRange

preferably you would be able to figure this out from the Elasticsearch response, but i am not sure if that is possible.

blfrantz · 2018-06-28T23:23:00Z

No problem, @ppisljar .

So in the relative search case, timeRange only contains something like {from: "now-15m", to: "now", mode: "quick"}. Without knowing when the query fired, that's not really useful (and even if I did, it'd be ugly to calculate). As for the ES response, the data returned by the _msearch query doesn't appear to have any timestamp info aside from the bucket times, but those don't necessarily align with the queried range (if they did, this whole feature would be unnecessary).
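To illustrate why the raw `timeRange` is not enough on its own: a relative expression such as `now-15m` only becomes a timestamp once it is resolved against the moment the query fired. The sketch below uses a hypothetical `resolveRelative` function that understands only `now` and `now-<n><unit>`; it is not Kibana's date math, which supports far more.

```javascript
// Hypothetical, deliberately tiny resolver for "now" and "now-<n><s|m|h|d>".
const UNIT_MS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000, d: 24 * 60 * 60 * 1000 };

function resolveRelative(expr, nowMs) {
  if (expr === 'now') return nowMs;
  const match = /^now-(\d+)([smhd])$/.exec(expr);
  if (!match) throw new Error(`Unsupported expression: ${expr}`);
  return nowMs - Number(match[1]) * UNIT_MS[match[2]];
}

// The reference time is an assumption; in practice it is when the query fired.
const firedAt = Date.UTC(2018, 5, 28, 23, 0, 0);
const range = {
  gte: resolveRelative('now-15m', firedAt),
  lte: resolveRelative('now', firedAt),
};
console.log(new Date(range.gte).toISOString()); // 2018-06-28T22:45:00.000Z
```

Without `firedAt`, the same `{from: "now-15m", to: "now"}` object maps to a different absolute range on every evaluation, which is the problem described above.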

Unless this information is available somewhere I'm not aware of (or not that hard to pass in), I'm beginning to think my original approach may be more realistic. In the same way that the "partial data" tooltip only shows up for some types of visuals (not all where it could be relevant), I'm wondering if having this feature exist only for some of the relevant cases (arguably the ones that matter most) is enough for now? What do you think? If you're ok with that approach, can you provide feedback on my 4th question (about test frameworks)?

Thanks!

ppisljar · 2018-06-29T05:15:51Z

import { getTime } from 'ui/timefilter/get_time';

const parsedTime = getTime(vis.indexPattern, vis.filters.timeRange);

let me know if this helps

blfrantz · 2018-07-01T18:33:00Z

@ppisljar thanks for that tip, it worked great. I've updated the PR with the new implementation, let me know what you think.

ppisljar

thanks @blfrantz !

i think this will work much better (especially with our future plans).

i have one more suggestion. Generally i think we are always talking about first and last bucket. Would it make sense to let tabify do its job (convert to table) and then check just some rows and remove those in something like a postprocessing step on tabify ? this way we wouldn't need to pass time information all the way down to TabifyBuckets and we could keep this code more isolated.

ppisljar · 2018-07-03T09:38:53Z

src/ui/public/agg_response/tabify/_buckets.js

@@ -81,4 +82,20 @@ TabifyBuckets.prototype._orderBucketsAccordingToParams = function (params) {
  }
 };

+TabifyBuckets.prototype._dropPartials = function (params, timeRange) {
+  if (params.drop_partials && !this.objectMode && this.buckets.length > 1) {


rather than wrapping the whole thing in an if statement return early:

if (!params.drop_partials || this.objectMode || this.buckets.length <= 1) { return; }

maybe move the if (params.drop_partials) out of this function ?

also join with the if statement below ... and extract to variables so it's easier to read:

const isTimeField = params.field.name === timeRange.name;
if (this.objectMode || this.buckets.length <= 1 || !isTimeField) { return; }
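Put together, the guard-clause style suggested in this review could look roughly like the sketch below. It is a standalone function rather than a `TabifyBuckets` method; the `state` argument stands in for `this`, and the filtering body is an assumption about what the method does after the guards.

```javascript
// Hypothetical standalone version; `state` stands in for the TabifyBuckets instance.
function dropPartials(state, params, timeRange) {
  const isTimeField = params.field.name === timeRange.name;
  // Return early instead of wrapping the whole body in one large if block.
  if (!params.drop_partials || state.objectMode || state.buckets.length <= 1 || !isTimeField) {
    return state.buckets;
  }

  // Assumes one entry per bucket, so neighbouring keys give the interval.
  const interval = state.buckets[1].key - state.buckets[0].key;
  return state.buckets.filter(
    bucket => bucket.key >= timeRange.gte && bucket.key + interval <= timeRange.lte
  );
}

const state = { objectMode: false, buckets: [{ key: 0 }, { key: 10 }, { key: 20 }, { key: 30 }] };
const timeRange = { name: '@timestamp', gte: 5, lte: 35 };

const onTimeField = dropPartials(state, { drop_partials: true, field: { name: '@timestamp' } }, timeRange);
const onOtherField = dropPartials(state, { drop_partials: true, field: { name: 'other' } }, timeRange);
console.log(onTimeField.map(b => b.key)); // [ 10, 20 ]
console.log(onOtherField.length);         // 4
```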

blfrantz · 2018-07-05T00:29:28Z

Thanks @ppisljar. That makes sense, and does make the code organization more elegant/intuitive. However I'm finding that it makes the dropPartials function itself rather more complicated and therefore more brittle.

Because of the way the tabified data is structured, things that were simple become a bit complicated in the proposed approach.

In the no-split-charts case, the following works (called with dropPartialRows(write.response(), write.timeRange) toward the end of tabifyAggResponse() in tabify.js):

import _ from 'lodash';

function dropPartialRows(tableGroup, timeRange) {
  tableGroup.tables.forEach(table => {
    if (table.rows.length < 2) return;

    // Find the date_histogram column on the time field that has drop_partials enabled.
    const aggIndex = _.findIndex(table.columns, {
      'aggConfig': {
        '_opts': {
          'type': 'date_histogram',
          'params': {
            'field': timeRange.name,
            'drop_partials': true
          }
        }
      }
    });
    if (aggIndex === -1) return;

    // Several rows can share one bucket, so scan for the first row with a
    // different key to work out the bucket interval.
    const time0 = table.rows[0][aggIndex].key;
    const nextRow = _.find(table.rows, r => r[aggIndex].key !== time0);
    if (!nextRow) return; // only one bucket, so the interval is unknown

    const interval = nextRow[aggIndex].key - time0;

    // Keep only rows whose bucket lies entirely inside the queried range.
    table.rows = table.rows.filter(row => {
      const start = row[aggIndex].key;
      return start >= timeRange.gte && start + interval <= timeRange.lte;
    });
  });
}

That's not too bad, but some things that could be simple (like determining the interval) get a bit complicated because tabify's rows array contains a separate row for every value on each series, which means there can be multiple rows per time bucket, which means I have to scan the array until I find a new time to determine the interval. Previously, I could just compare the first and second items in the array.

Furthermore, the above code doesn't handle the split charts case (whereas the current PR code does), because in that event tableGroup contains tables of tables, and so I'd need to process this recursively or something. That's not a big deal, but as this gets more complex (and computationally expensive) I wanted to see if you thought this was still the best approach. It also seems harder to test. Apologies if I'm missing something. Thanks!
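For the split-charts case mentioned above, the recursion could be sketched as below. The structure is an assumption for illustration: a group has a `tables` array whose entries are either plain tables (with `rows`) or nested groups (with their own `tables`). `forEachTable` is a hypothetical helper, not part of tabify.

```javascript
// Hypothetical traversal: visit every leaf table, descending into nested groups.
function forEachTable(group, visit) {
  group.tables.forEach(entry => {
    if (Array.isArray(entry.tables)) {
      forEachTable(entry, visit); // nested group produced by a split
    } else {
      visit(entry);               // plain table with rows
    }
  });
}

const response = {
  tables: [
    { tables: [{ rows: [1, 2] }, { rows: [3] }] }, // split: a group of tables
    { rows: [4, 5, 6] },                           // unsplit table
  ],
};

const rowCounts = [];
forEachTable(response, table => rowCounts.push(table.rows.length));
console.log(rowCounts); // [ 2, 1, 3 ]
```

A per-table function such as the row filter above could then be passed as `visit`, at the cost of walking the whole converted structure a second time.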

ppisljar

You are right, the implementation as it is in the current PR seems to be way simpler.

i left some nitpicking comments, feel free to ignore them.

@timroes can you also take a look ?

timroes · 2018-07-09T12:25:06Z

src/ui/public/agg_types/controls/drop_partials.html

+    &nbsp;
+    <icon-tip
+      position="'right'"
+      content="'Removes buckets that include times not covered by the Time Range.'"


@gchaps What would be the best wording we could use here to describe the "Drop partial buckets" option? Besides the above I thought about something like:

Removes buckets from the beginning and end, that are partially outside the visualization's time range.

gchaps · 2018-07-10

@timroes Maybe something like this:

Remove buckets that span time outside the time range so the histogram doesn't start and end with incomplete buckets.

timroes · 2018-07-09T12:41:20Z

src/ui/public/vis/response_handlers/basic.js

@@ -74,7 +75,8 @@ const BasicResponseHandlerProvider = function (Private) {
        const tableGroup = aggResponse.tabify(vis.getAggConfig().getResponseAggs(), response, {
          canSplit: true,
          asAggConfigResults: true,
-          isHierarchical: vis.isHierarchical()
+          isHierarchical: vis.isHierarchical(),
+          timeRange: getTime(vis.indexPattern, vis.filters.timeRange).range


@ppisljar Since we talked earlier about removing vis.filters, should we introduce a new reference here? Do we have an idea how we remove that in the future?

ppisljar · 2018-07-24

vis.filters is going away in the long run, but the response handlers will still get access to the timeRange (probably being passed in directly as a parameter in the future) so this should not be a problem.

markov00 · 2018-08-21

@blfrantz If you create a visualization without a time field you will get an error for this line because getTime will return undefined if no time field is specified on the selected indexPattern.

You can verify the error using the shakespeare dataset https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html and just creating a barchart with that index.
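One way to avoid that error is to check the result before reading `.range`. The sketch below is only illustrative: `getTimeStub` is a stand-in written for this example (it is not `ui/timefilter/get_time`), and both the `timeFieldName` property and the shape of the returned object are assumptions.

```javascript
// Stand-in for getTime: returns undefined when the index pattern has no time field.
function getTimeStub(indexPattern, timeRange) {
  if (!indexPattern.timeFieldName) return undefined;
  return { range: { name: indexPattern.timeFieldName, gte: timeRange.from, lte: timeRange.to } };
}

// Hypothetical guard: only read .range when a time filter actually exists.
function safeTimeRange(indexPattern, timeRange) {
  const time = getTimeStub(indexPattern, timeRange);
  return time ? time.range : undefined;
}

const timeRange = { from: 1000, to: 2000 };
const withTimeField = safeTimeRange({ timeFieldName: '@timestamp' }, timeRange);
const withoutTimeField = safeTimeRange({}, timeRange); // e.g. the shakespeare index

console.log(withTimeField.name); // @timestamp
console.log(withoutTimeField);   // undefined
```

Downstream code then has to treat an undefined range as "nothing to drop" rather than dereferencing it.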

blfrantz · 2018-07-09T23:31:38Z

@ppisljar Thanks, I've cleaned up the conditionals for _dropPartials, let me know how it looks now. I still need to rebase/retest before this is ready to pull but waiting to see if any further tweaks are needed per @timroes comments.

blfrantz · 2018-07-12T23:35:46Z

@ppisljar & @timroes: I just force-pushed a cleanup of the commit history to remove all the old iterations. As part of this I added the new tooltip text recommended by @gchaps. I also just rebased onto the latest upstream, fixed a couple of my tests that were broken by some recent upstream changes, and sanity tested that the feature still works as expected. Awaiting any further instructions. Thanks!

blfrantz · 2018-07-23T15:38:05Z

@ppisljar & @timroes: I know you guys are busy, but just wanted to make sure this was still on your radar. Thanks!

timroes · 2018-07-23T15:40:54Z

Jenkins, test this

elasticmachine · 2018-07-23T17:34:01Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

blfrantz · 2018-08-22T02:47:43Z

@markov00 Thanks for the feedback and tip about that test failure. I've merged and made the requested fixes. I also confirmed that the failing Shakespeare test passes locally now.

markov00 · 2018-08-22T07:40:46Z

jenkins test this

elasticmachine · 2018-08-22T08:12:34Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

markov00 · 2018-08-22T09:21:35Z

jenkins test this

elasticmachine · 2018-08-22T11:07:40Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

markov00 · 2018-08-22T12:09:59Z

jenkins test this, want to be sure it was a flaky CI test failure

elasticmachine · 2018-08-22T13:54:51Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

blfrantz · 2018-08-30T04:03:04Z

What's the next step here?

timroes · 2018-08-30T07:09:58Z

Jenkins, test this - I'll give it a last test run, then will merge it.

elasticmachine · 2018-08-30T08:54:22Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

…ic#19979)
* Add Drop Partials option to date histogram agg settings UI
* Add timeRange to aggOpts and parse in _response_writer
* Implement dropPartials method in TabifyBuckets
* Fixed a couple issues
* Fixed issue with undefined timeRange
* Use braces for conditionals

timroes · 2018-08-30T09:10:45Z

Hi Brian,

thanks a lot for your first contribution to Kibana! I just merged your commit on master and it will be backported to 6.x, so that these changes will be released in 6.5.

Cheers
Tim

… (#22528)
* Add Drop Partials option to date histogram agg settings UI
* Add timeRange to aggOpts and parse in _response_writer
* Implement dropPartials method in TabifyBuckets
* Fixed a couple issues
* Fixed issue with undefined timeRange
* Use braces for conditionals

blfrantz · 2018-08-30T11:07:52Z

Awesome, thanks everyone!

LeeDr · 2018-11-10T00:15:21Z

@timroes or @markov00 or @ppisljar can any of you help me understand why I don't see this option here on this 6.5.0 visualization? This one is using ecommerce sample data, and I also tried a new line chart from makelogs data.

ppisljar · 2018-11-12T08:52:11Z

@LeeDr #25520

elasticmachine · 2018-11-12T09:01:24Z

Pinging @elastic/kibana-app

Benny-Git · 2018-12-07T09:07:03Z

We just updated to 6.5.2, and I still don't see the checkbox option, even for a newly created visualization.
According to the release notes, #25520 should be included in 6.5.2?

timroes · 2018-12-07T09:18:36Z

@Benny-Git unfortunately that PR was missing the backport to 6.5 😞 This should be released in a future 6.5 patch release instead, sorry.

@gchaps Could you perhaps remove "Fixes option for showing partial buckets #25520" from the 6.5.2 release notes?

Benny-Git · 2018-12-07T09:30:22Z

@timroes thanks for the very quick response.
I'll just manually update my few relevant saved objects with "drop_partials": true for now :)
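A manual update along those lines could be scripted roughly as follows. This is a sketch under assumptions: it treats the visualization's state as a JSON string containing an `aggs` array whose entries have `type` and `params`, and `enableDropPartials` is a hypothetical helper. How the saved object is read and written back is left out.

```javascript
// Hypothetical helper: set drop_partials on every date_histogram agg in a
// JSON-encoded visualization state and return the updated JSON string.
function enableDropPartials(visStateJson) {
  const visState = JSON.parse(visStateJson);
  (visState.aggs || []).forEach(agg => {
    if (agg.type === 'date_histogram') {
      agg.params = Object.assign({}, agg.params, { drop_partials: true });
    }
  });
  return JSON.stringify(visState);
}

// Example state; the field names and values here are made up for illustration.
const before = JSON.stringify({
  title: 'Requests over time',
  aggs: [
    { id: '1', type: 'count', params: {} },
    { id: '2', type: 'date_histogram', params: { field: '@timestamp', interval: 'auto' } },
  ],
});

const after = JSON.parse(enableDropPartials(before));
console.log(after.aggs[1].params.drop_partials); // true
console.log('drop_partials' in after.aggs[0].params); // false
```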

gchaps · 2018-12-07T22:33:00Z

@timroes Done.

timroes added Feature:Visualizations Generic visualization features (in case no more specific feature label is available) Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) labels Jun 18, 2018

timroes requested review from ppisljar and timroes June 18, 2018 07:01

blfrantz changed the title ~~Add option to drop partial buckets from data_histogram visuals~~ Add option to drop partial buckets from date_histogram visuals Jul 1, 2018

ppisljar requested changes Jul 3, 2018

ppisljar approved these changes Jul 9, 2018

timroes reviewed Jul 9, 2018

blfrantz added 4 commits July 12, 2018 17:47

Add Drop Partials option to date histogram agg settings UI

e782505

Add timeRange to aggOpts and parse in _response_writer

43a9eb2

Implement dropPartials method in TabifyBuckets

5f1fcf2

Fixed a couple issues

841da96

blfrantz force-pushed the master branch from 4318c4a to 841da96 Compare July 12, 2018 23:28

Benny-Git mentioned this pull request Jul 24, 2018

line chart - option to exclude last point with partial data #9062

Closed

blfrantz added 2 commits August 21, 2018 21:44

Fixed issue with undefined timeRange

d387d4c

Use braces for conditionals

53b43f9

markov00 approved these changes Aug 22, 2018

timroes merged commit 25761fb into elastic:master Aug 30, 2018

timroes added v7.0.0 v6.5.0 labels Aug 30, 2018

timroes mentioned this pull request Aug 30, 2018

[6.x] Add option to drop partial buckets from date_histogram visuals (#19979) #22528

Merged

timroes added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Nov 12, 2018
